Abstract

In this paper, we study the notion of codes with hierarchical locality that is identified as another approach to local recovery 
from multiple erasures. The well-known class of codes with locality is said to possess hierarchical locality with a single level. 
In a code with two-level hierarchical locality, every symbol is protected by an inner-most local code, and another middle-level 
code of larger dimension containing the local code. We first consider codes with two levels of hierarchical locality, derive an 
upper bound on the minimum distance, and provide optimal code constructions of low field-size under certain parameter sets. 
Subsequently, we generalize both the bound and the constructions to hierarchical locality of arbitrary levels. 

Index Terms 

Codes with locality, locally recoverable codes, hierarchical locality, multiple erasures, distributed storage. 


I. Introduction 

An important desirable attribute in a distributed storage system is efficiency in carrying out repair of failed nodes. Among
many others, two important metrics that characterize the efficiency of node repair are repair bandwidth, i.e., the amount of data
downloaded in the case of a node failure, and repair degree, i.e., the number of helper nodes accessed for node repair. While
regenerating codes [1] aim to minimize the repair bandwidth, codes with locality [2] seek to minimize the repair degree. The
focus of the present paper is on codes with locality.


A. Codes with Locality 

An $[n, k, d]$ linear code $\mathcal{C}$ may require access to $k$ symbols to recover one lost symbol. The notion of locality of
code symbols was introduced in [2], with the aim of designing codes in such a way that the number of symbols accessed
to repair a lost symbol is much smaller than the dimension $k$ of the code. The code $\mathcal{C}$ is said to have locality $r$ if the $i$-th
code symbol $c_i$, $1 \le i \le n$, can be recovered by accessing $r \ll k$ other code symbols. In [2], the authors proved an upper bound
on the minimum distance of codes with locality, and showed that an existing family of pyramid codes [3] can achieve the
bound. In [4], the authors extended the notion to $(r, \delta)$-locality, where each symbol can be recovered locally even in the presence
of an additional $(\delta - 2)$ erasures. In [2], the authors introduced the categories of information-symbol and all-symbol locality. In the
former, local recoverability is guaranteed for symbols from an information set, while in the latter, it is guaranteed for every
symbol. Explicit constructions for codes with all-symbol locality are provided in [5], [6], respectively based on rank-distance
and Reed-Solomon (RS) codes. Improved bounds on the minimum distance of codes with all-symbol locality are provided in
[?] along with certain optimal constructions. Families of codes with all-symbol locality with small alphabet size (low field
size) are constructed in [9]. Locally repairable codes over the binary alphabet are constructed in [10]. A new approach of local
regeneration, in which repair is both local and bandwidth-efficient within the local group, achievable by making
use of a vector alphabet, is considered in [?], [11], [12].

Recently, several approaches have been proposed in the literature [?] to address the problem of recovering from
multiple erasures locally. The notion of $(r, \delta)$-locality introduced in [4] is one such. In [13], an approach of protecting a single
symbol by multiple support-disjoint local codes of the same length is considered. An upper bound on the minimum distance
is derived, and the existence of optimal codes is established under certain constraints. A similar approach is considered in [9]
also. In [9], the authors allow multiple recovering sets of different sizes, and also provide constructions requiring field size only
on the order of the block length. Quite differently, the authors of [7] consider codes allowing sequential recovery of two erasures,
motivated by the fact that such a family of codes allows a larger minimum distance. An upper bound on the minimum distance
and optimal constructions for a restricted set of parameters are provided.


B. Our Contributions 

In the present paper, we study the notion of hierarchical locality, which is identified as another approach to local recovery from
multiple erasures. In consideration of practical distributed storage systems, Duminuco et al. in [13] had proposed the topology
of hierarchical codes earlier. They compared hierarchical codes with RS codes in terms of repair efficiency using real network
traces of KAD and PlanetLab networks. Their work was focused on collecting empirical data for performance improvements,
rather than undertaking a theoretical study of such a topology. In the present paper, we study codes with hierarchical locality,
first considering the case of two-level hierarchy. We derive an upper bound on the minimum distance and provide optimal code
constructions under certain parameter sets. This is further generalized to a setting of $h$-level hierarchy in a straightforward manner.

This research is supported in part by the National Science Foundation under Grant No. 1422955 and in part by the joint UGC-ISF Research Grant No.
1676/14. Birenjith Sasidharan would like to acknowledge the support of TCS Research Scholar Programme Fellowship.

Fig. 1: Illustration of [16,12,4]-code used in Windows Azure.

Fig. 2: Illustration of [24,14,6]-code having 2-level hierarchical locality.

II. Codes with Hierarchical Locality 

The Windows Azure Storage solution employs a [16,12,4]-pyramid code with a locality parameter $r = 6$. In the code,
as illustrated in Fig. 1, every code symbol except the global parities $P_1, P_2$ can be recovered by accessing $r = 6$ other code
symbols. While the code performs well in systems where single node-failure remains the dominant event, it requires connecting
to $k = 12$ symbols to recover a failed symbol under certain erasure patterns consisting of 2 node-failures. We consider an example of
a [24,14,6]-code from the family of codes with hierarchical locality in an attempt to reduce such an overhead. The structure of
the code is depicted in Fig. 2 as a tree in which each node represents a constituent code. The code contains two support-disjoint
$[n_1 = 12, r_1 = 8, \delta_1 = 3]$ codes, each of them in turn comprised of three support-disjoint $[n_2 = 4, r_2 = 3, \delta_2 = 2]$ codes.
Making use of the [4,3,2]-code, all single erasures can be repaired by accessing $r_2 = 3$ symbols, which is half the number of symbols
required in the Windows Azure code in a similar situation. We can recover a lost symbol by connecting to $r_1 = 8$ symbols in the
case of an erasure pattern involving 2 erasures. This is in contrast to the Windows Azure code, where we had to download the
entire message of 12 symbols. While the Windows Azure code offers a storage overhead of 1.3x with a minimum distance of
4, our code has a larger overhead of 1.7x with a better minimum distance $d = 6$. The example [24,14,6]-code can indeed
be constructed, and it will be shown that its minimum distance is optimal among the class of codes.

A. Preliminaries 

Definition 1 [4]: An $[n, k, d]$ linear code $\mathcal{C}$ is a code with $(r, \delta)$-locality if for every symbol $c_i$, $1 \le i \le n$, there exists a
punctured code $\mathcal{C}_i$ such that $c_i \in \mathrm{Supp}(\mathcal{C}_i)$ and the following conditions hold: 1) $\dim(\mathcal{C}_i) \le r$, 2) $d_{\min}(\mathcal{C}_i) \ge \delta$.

Codes with locality were first defined in [2] for the case of $\delta = 2$, and the class was generalized for arbitrary $\delta$ in [4]. In the
definition given in [4], the authors imposed constraints on the length and the $d_{\min}$ of $\mathcal{C}_i$. We replace the constraint on length
with a constraint on $\dim(\mathcal{C}_i)$, and it may be noted that this does not introduce any loss in generality. The code $\mathcal{C}_i$ associated
with the $i$-th symbol is referred to as its local code. If it is sufficient to have local codes only for symbols belonging to some
fixed information set $I$, such codes are referred to as codes with information-symbol $(r, \delta)$-locality. The general class in Def. 1
is also referred to as codes with all-symbol $(r, \delta)$-locality, in order to differentiate them from the former. In this paper, unless
otherwise mentioned, we consider codes with all-symbol locality.

Definition 2: An $[n, k, d]$ linear code $\mathcal{C}$ is a code with hierarchical locality having locality parameters $[(r_1, \delta_1), (r_2, \delta_2)]$ if
for every symbol $c_i$, $1 \le i \le n$, there exists a punctured code $\mathcal{C}_i$ such that $c_i \in \mathrm{Supp}(\mathcal{C}_i)$ and the following conditions hold:
1) $\dim(\mathcal{C}_i) \le r_1$, 2) $d_{\min}(\mathcal{C}_i) \ge \delta_1$, 3) $\mathcal{C}_i$ is a code with $(r_2, \delta_2)$-locality.

The punctured code $\mathcal{C}_i$ associated with $c_i$ is referred to as its middle code. Since the middle code is a code with locality,
each of its symbols will in turn be associated with a local code.


B. An Upper Bound On the Minimum Distance 

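Theorem 2.1: Let $\mathcal{C}$ be an $[n, k, d]$-linear code with hierarchical locality having locality parameters $[(r_1, \delta_1), (r_2, \delta_2)]$. Then

$$d \le n - k + 1 - \left(\left\lceil \frac{k}{r_2} \right\rceil - 1\right)(\delta_2 - 1) - \left(\left\lceil \frac{k}{r_1} \right\rceil - 1\right)(\delta_1 - \delta_2). \tag{1}$$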


Proof: We extend the techniques introduced in [2] in proving the theorem. A punctured code $\mathcal{C}_s$ of $\mathcal{C}$ having dimension
$k - 1$ is identified first. Then we will use the fact that

$$d \le n - |\mathrm{Supp}(\mathcal{C}_s)|. \tag{2}$$

Algorithm 1 (see the flow chart in App. A) is used to find $\mathcal{C}_s$ with a large support. In each iteration indexed by $j$, the
algorithm identifies a middle code from $\mathcal{C}$ that accumulates additional rank. Then it picks up local codes from within the
middle code that accumulate additional rank. Clearly, the algorithm terminates as the total rank is bounded by $k$. Let $i_{\text{end}}$ and
$j_{\text{end}}$ respectively denote the final values of the variables $i$ and $j$ before the algorithm terminates. Let $a_i$ denote the incremental
rank and $s_i$ denote the incremental support while adding a local code $L_i$. Then we have

$$s_i \ge a_i + (\delta_2 - 1), \quad 1 \le i \le i_{\text{end}},$$

since we have $a_i > 0$ in every iteration. The set $S_i$ denotes the support of $L_i$, and $V_i$ denotes the space Column-Space$(G|_{S_i})$,
where $G$ is the generator matrix of the code. If no more local codes can be added from the middle code $M_j$, then the support
of the last local code added from $M_j$ is removed and an additional support $T_j$ of $M_j$ is added to $\Psi$. Let $i(j)$ denote the index
of the last local code added from $M_j$. Since the middle code has a minimum distance of $\delta_1$, and every rank-accumulating
local code brings at least one new information symbol, it follows that

$$t_j := |T_j| \ge a_{i(j)} + (\delta_1 - 1) = a_{i(j)} + (\delta_2 - 1) + (\delta_1 - \delta_2), \quad 1 \le j \le j_{\text{end}}.$$

Algorithm 1 For the proof of Thm. 2.1

1: Let $j = 0$, $i = 0$, $W = \phi$, $\Psi = \phi$.
2: while ($\exists$ a middle code $M_j \in \mathcal{C}$ such that $\mathrm{rank}(G|_{\Psi \cup M_j}) > \mathrm{rank}(G|_{\Psi})$) do
3:     while ($\exists$ a local code $L_i \in M_j$ such that $V_i \not\subseteq W$) do
4:         $W = W + V_i$
5:         $\Psi = \Psi \cup S_i$
6:         $i = i + 1$
7:     end while
8:     $\Psi = (\Psi \setminus S_{i-1}) \cup T_j$
9:     $j = j + 1$
10: end while


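The rank accumulates to $k$ after adding the last local code $L_{i_{\text{end}}}$. We would also have visited $j_{\text{end}}$ middle codes by then.
Hence,

$$i_{\text{end}} \ge \left\lceil \frac{k}{r_2} \right\rceil, \qquad j_{\text{end}} \ge \left\lceil \frac{k}{r_1} \right\rceil. \tag{3}$$

After adding the local codes up to $L_{i_{\text{end}}-1}$, we would have accumulated rank that is less than or equal to $(k - 1)$. Hence we can always
pick $s_e := (k - 1) - \sum_{i=1}^{i_{\text{end}}-1} a_i$ columns from $L_{i_{\text{end}}}$ so that the total rank accumulated becomes $(k - 1)$. Note that $s_e \ge 0$. The
resultant punctured code is identified as $\mathcal{C}_s$. Let $E = \{i(j) \mid 1 \le j \le j_{\text{end}}\}$. Then

$$|\mathrm{Supp}(\mathcal{C}_s)| \ge \sum_{\substack{i=1 \\ i \notin E}}^{i_{\text{end}}-1} s_i + s_e + \sum_{j=1}^{j_{\text{end}}-1} t_j. \tag{4}$$

In (4), the last term $\sum_{j=1}^{j_{\text{end}}-1} t_j$ includes a sum of only $j_{\text{end}} - 1$ terms because we could have possibly accumulated a rank
$(k - 1)$ after adding $L_{i_{\text{end}}-1}$, i.e., $s_e = 0$. Thus we have,

$$\begin{aligned}
|\mathrm{Supp}(\mathcal{C}_s)| &\ge \sum_{\substack{i=1 \\ i \notin E}}^{i_{\text{end}}-1} s_i + (k - 1) - \sum_{i=1}^{i_{\text{end}}-1} a_i + \sum_{j=1}^{j_{\text{end}}-1} t_j \\
&\ge \sum_{\substack{i=1 \\ i \notin E}}^{i_{\text{end}}-1} \bigl(a_i + (\delta_2 - 1)\bigr) + (k - 1) - \sum_{i=1}^{i_{\text{end}}-1} a_i + \sum_{j=1}^{j_{\text{end}}-1} \bigl(a_{i(j)} + (\delta_2 - 1) + (\delta_1 - \delta_2)\bigr) \\
&= \sum_{i=1}^{i_{\text{end}}-1} (\delta_2 - 1) + (k - 1) + \sum_{j=1}^{j_{\text{end}}-1} (\delta_1 - \delta_2).
\end{aligned}$$

Substituting the values of $i_{\text{end}}$ and $j_{\text{end}}$ from (3) and using (2), we obtain the bound. ■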

It may be noted that the theorem holds good even for codes with information-symbol hierarchical locality. 
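
As a quick numerical check of the bound in (1), the following sketch evaluates its right-hand side for the two codes discussed at the start of this section. The function name `two_level_bound` is ours and purely illustrative. The Windows Azure code has a single level of locality, so we treat it here by setting $r_1 = r_2 = 6$ and $\delta_1 = \delta_2 = 2$, which is an assumption made only so that the same formula can be reused (the middle-level term then vanishes).

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)

def two_level_bound(n, k, r1, delta1, r2, delta2):
    """Right-hand side of bound (1) on the minimum distance d."""
    local_term = (ceil_div(k, r2) - 1) * (delta2 - 1)
    middle_term = (ceil_div(k, r1) - 1) * (delta1 - delta2)
    return n - k + 1 - local_term - middle_term

# [24, 14] code with (r1, delta1) = (8, 3) and (r2, delta2) = (3, 2).
hier = two_level_bound(24, 14, r1=8, delta1=3, r2=3, delta2=2)

# [16, 12] Windows Azure code, single-level locality r = 6, delta = 2
# (modelled with identical parameters at both levels; an assumption).
azure = two_level_bound(16, 12, r1=6, delta1=2, r2=6, delta2=2)

print(hier, azure)  # 6 4
```

Both values coincide with the minimum distances quoted for the two codes, so neither code can be improved in distance without changing its other parameters.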
C. Code Constructions For Information-Symbol Locality

A straightforward extension of pyramid codes [3] is possible to construct optimal codes with information-symbol hierarchical
locality. They achieve the bound in (1) if $r_2 \mid r_1 \mid k$. In this section, we illustrate the construction with an example, assuming
$\delta_2 = 2$. The two-level hierarchical code described here extends naturally to multiple-level hierarchy and yields optimal codes
if $r_h \mid r_{h-1} \mid \cdots \mid r_1 \mid k$.

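The construction is built on a systematic MDS code with parameters $[k + d - 1, k, d]$ with a generator matrix $G_{\text{mds}}$. Let

$$G_{\text{mds}} = [\, I_{k \times k} \mid Q_{k \times (d-1)} \,],$$
$$k = \alpha r_1 + \beta r_2 + \gamma, \quad 0 \le \beta r_2 + \gamma < r_1,$$
$$r_1 = \mu r_2 + \nu, \quad 0 \le \nu < r_2.$$

Partition $Q$ as

$$Q = \left[ \begin{array}{c|c} \begin{matrix} Q_1 \\ Q_2 \\ \vdots \\ Q_{\alpha+1} \end{matrix} & Q' \end{array} \right],$$

where $Q_i$, $1 \le i \le \alpha$, is of size $r_1 \times (\delta_1 - 1)$, $Q_{\alpha+1}$ is of size $(\beta r_2 + \gamma) \times (\delta_1 - 1)$ and $Q'$ is of size $k \times (d - \delta_1)$. Further
partition $Q_i$, $1 \le i \le \alpha$, as

$$Q_i = \left[ \begin{array}{c|c} \begin{matrix} R_{i,1} \\ R_{i,2} \\ \vdots \\ R_{i,\mu+1} \end{matrix} & R'_i \end{array} \right],$$

where $R_{i,j}$, $1 \le j \le \mu$, is of size $r_2 \times 1$, $R_{i,\mu+1}$ is of size $\nu \times 1$ and $R'_i$ is of size $r_1 \times (\delta_1 - 2)$. At the same time $Q_{\alpha+1}$ is
partitioned as

$$Q_{\alpha+1} = \left[ \begin{array}{c|c} \begin{matrix} R_{\alpha+1,1} \\ R_{\alpha+1,2} \\ \vdots \\ R_{\alpha+1,\beta+1} \end{matrix} & R'_{\alpha+1} \end{array} \right],$$

where $R_{\alpha+1,j}$, $1 \le j \le \beta$, is of size $r_2 \times 1$, $R_{\alpha+1,\beta+1}$ is of size $\gamma \times 1$ and $R'_{\alpha+1}$ is of size $(\beta r_2 + \gamma) \times (\delta_1 - 2)$. Next, we
can construct the matrices

$$\tilde{Q}_i = \left[ \begin{array}{cccc|c} R_{i,1} & & & & \\ & R_{i,2} & & & R'_i \\ & & \ddots & & \\ & & & R_{i,\mu+1} & \end{array} \right], \qquad
\tilde{Q}_{\alpha+1} = \left[ \begin{array}{cccc|c} R_{\alpha+1,1} & & & & \\ & R_{\alpha+1,2} & & & R'_{\alpha+1} \\ & & \ddots & & \\ & & & R_{\alpha+1,\beta+1} & \end{array} \right],$$

$$\tilde{Q} = \left[ \begin{array}{cccc|c} \tilde{Q}_1 & & & & \\ & \tilde{Q}_2 & & & Q' \\ & & \ddots & & \\ & & & \tilde{Q}_{\alpha+1} & \end{array} \right].$$

Let $J$ be defined as

$$J = \begin{bmatrix} I_{r_2} & & & \\ & \ddots & & \\ & & I_{r_2} & \\ & & & I_{\nu} \end{bmatrix},$$

where $I_{r_2}$ is repeated $\mu$ times. Finally we construct the generator matrix $G$ of the pyramid code as

$$G = \left[ \begin{array}{ccccccc|c} J & & & & & & & \\ & \ddots & & & & & & \\ & & J & & & & & \\ & & & I_{r_2} & & & & \tilde{Q} \\ & & & & \ddots & & & \\ & & & & & I_{r_2} & & \\ & & & & & & I_{\gamma} & \end{array} \right],$$

with $J$ being repeated $\alpha$ times, and $I_{r_2}$ repeated $\beta$ times. The resultant code has length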


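$$n = k + d - 1 + \left( \alpha \left\lceil \frac{r_1}{r_2} \right\rceil + \left\lceil \frac{\beta r_2 + \gamma}{r_2} \right\rceil - 1 \right) + \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right)(\delta_1 - 2).$$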


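Clearly, the minimum distance of the code corresponding to $G$ is greater than or equal to that of $G_{\text{mds}}$, which is $d$. Also, $G$ satisfies
the property of information-symbol hierarchical locality by construction. Using the bound in (1), the code is optimal if

$$\alpha \left\lceil \frac{r_1}{r_2} \right\rceil + \left\lceil \frac{\beta r_2 + \gamma}{r_2} \right\rceil = \left\lceil \frac{k}{r_2} \right\rceil.$$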
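
The block structure above fixes the code length once $k$, $d$, $r_1$, $r_2$ and $\delta_1$ are chosen, and the length can be obtained simply by counting columns of $G$. The sketch below does this count under the section's assumption $\delta_2 = 2$; the function names are ours, and we additionally assume that an empty block (the case $\nu = 0$ or $\gamma = 0$) contributes no parity column. It then compares the designed distance $d$ against the right-hand side of (1).

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)

def pyramid_length(k, d, r1, delta1, r2):
    """Count the columns of G for the two-level pyramid construction (delta_2 = 2)."""
    alpha, rest = divmod(k, r1)                 # k = alpha*r1 + (beta*r2 + gamma)
    middle_groups = [r1] * alpha + ([rest] if rest else [])
    # One parity column per local group of at most r2 information symbols.
    local_cols = sum(ceil_div(size, r2) for size in middle_groups)
    # Each middle group keeps the (delta1 - 2) columns of R'_i.
    middle_cols = len(middle_groups) * (delta1 - 2)
    # The columns of Q' act as global parities.
    global_cols = d - delta1
    return k + local_cols + middle_cols + global_cols

def bound_rhs(n, k, r1, delta1, r2, delta2=2):
    """Right-hand side of bound (1)."""
    return (n - k + 1
            - (ceil_div(k, r2) - 1) * (delta2 - 1)
            - (ceil_div(k, r1) - 1) * (delta1 - delta2))

n = pyramid_length(k=14, d=6, r1=8, delta1=3, r2=3)
print(n, bound_rhs(n, 14, 8, 3, 3))  # 24 6
```

For $k = 14$, $d = 6$, $(r_1, \delta_1) = (8, 3)$ and $r_2 = 3$ the count gives $n = 24$, and the bound evaluates to the designed distance, so these parameters meet (1) with equality.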


D. Code Constructions For All-Symbol Locality 

We assume a divisibility condition $n_2 \mid n_1 \mid n$. The construction is described in three parts. The first part involves identification
of a suitable finite field $\mathbb{F}_{p^m}$, a partition of $\mathbb{F}_{p^m}^*$ and a set of polynomials in $\mathbb{F}_{p^m}[X]$ that satisfy certain conditions. We require
that every polynomial evaluates to a constant within one subset in the partition, and evaluates to zero in all the remaining
subsets. In the second part, we construct a code polynomial $c(X)$ from the message symbols with the aid of the suitably
chosen polynomials. The code polynomial $c(X)$ is formed in such a way that the locality constraints are satisfied. This part
also involves precoding of message symbols in such a way that the dimensions of the middle codes and the global code are
kept to the desired values. Finally, the third part involves evaluation of the code polynomial at $n$ points of $\mathbb{F}_{p^m}^*$ that are chosen
in the first part.

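1) Identification of $\mathbb{F}_{p^m}$, a partition of $\mathbb{F}_{p^m}^*$ and a set of polynomials: Let the finite field $\mathbb{F}_{p^m}$ be such that $n_1 \mid p^m - 1$,
and $n < p^m$. Existence of such a pair $(p, m)$ is shown in App. B. We define the integers

$$n_0 = p^m - 1, \quad \mu_0 = 1, \quad \mu_1 = \frac{n_0}{n_1}, \quad \mu_2 = \frac{n_1}{n_2}.$$

Let $\alpha$ denote a primitive element of $\mathbb{F}_{p^m}$, and hence $\mathbb{F}_{p^m}^* = \{1, \alpha, \alpha^2, \ldots, \alpha^{p^m - 2}\}$. Set $\beta_0 = \alpha$ and let $\beta_1, \beta_2$ be elements of
order $n_1$ and $n_2$ respectively. Then we have the following subgroups:

$$H_0 = \mathbb{F}_{p^m}^*, \qquad H_1 = \{1, \beta_1, \ldots, \beta_1^{n_1 - 1}\}, \qquad H_2 = \{1, \beta_2, \ldots, \beta_2^{n_2 - 1}\}.$$

We can further write

$$H_0 = H_1 \uplus \beta_0 H_1 \uplus \cdots \uplus \beta_0^{\mu_1 - 1} H_1,$$
$$H_1 = H_2 \uplus \beta_1 H_2 \uplus \cdots \uplus \beta_1^{\mu_2 - 1} H_2.$$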

Having set up a subgroup chain, we proceed to define a family of subsets of $H_0$. These subsets are indexed by a tuple $(i, t)$
with $i \in \{0, 1, 2\}$. For a given value of $i$, $t$ takes values from the set $T_i = \{(t_0, t_1, t_2) \mid 1 \le t_j \le \mu_j \text{ for } j \le i;\ t_j = 0 \text{ for } j > i\}$.
For a given tuple $(i, t)$, let us define a coset $A_{(i,t)}$ of the subgroup $H_i$ as follows:

$$A_{(i,t)} = \gamma_{(i,t)} H_i, \qquad \gamma_{(i,t)} = \prod_{j=0}^{i-1} \beta_j^{\,t_{j+1} - 1}.$$

The set of possible indices has a tree structure with each index $(i, t)$ associated with a unique vertex of the tree. A vertex $(i, t)$
belongs to the $i$-th level of the tree, and the 3-tuple $t$ describes the unique path from the vertex to the root of the tree. The
parent of a vertex $(i, t)$ is denoted by $\pi(i, t)$, and the set of its siblings, i.e., other vertices having the same parent, is denoted
by $\psi(i, t)$. The tree structure of $\{A_{(i,t)} \mid i \in \{0, 1, 2\},\ t \in T_i\}$ is depicted in Fig. 3. Since each vertex at the $i$-th level is associated
with a coset of $H_i$, we refer to this tree as the coset-tree.
Fig. 3: Illustration of the coset-tree.


Next, we define the polynomials $p_{(i,t)}(X), q_{(i,t)}(X) \in \mathbb{F}_{p^m}[X]$ as

$$p_{(i,t)}(X) = \prod_{\theta \in A_{(i,t)}} (X - \theta), \qquad q_{(i,t)}(X) = \prod_{s \in \psi(i,t)} p_{(i,s)}(X).$$

The polynomial $p_{(i,t)}(X)$ is the annihilator of $A_{(i,t)}$. Furthermore, the polynomials in $\{q_{(i,t)}(X) \mid 1 \le t_i \le \mu_i,\ t_j \text{ is fixed for } j \ne i\}$
are relatively prime collectively. Thus for $i = 1, 2$ there exists $\{a_{(i,t)}(X) \mid 1 \le t_i \le \mu_i,\ t_j \text{ is fixed for } j \ne i\}$ such that

$$\sum_{\substack{t_i = 1 \\ t_j \text{ fixed for } j \ne i}}^{\mu_i} a_{(i,t)}(X)\, q_{(i,t)}(X) = 1 \mod \left( X^{n_{i-1}} - (\gamma_{(i-1,s)})^{n_{i-1}} \right), \tag{5}$$

where $s = \pi(i, t)$. Next, we define $E_{(i,t)}(X) = a_{(i,t)}(X)\, q_{(i,t)}(X)$, and determine a valid candidate for $E_{(i,t)}(X)$ in the next
Lemma such that (5) holds. Subsequently in Lem. 2.3 we will list certain useful properties of these polynomials. The
proof is relegated to the Appendix.

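Lemma 2.2:

$$E_{(i,t)}(X) = \prod_{s \in \psi(i,t)} \frac{X^{n_i} - (\gamma_{(i,s)})^{n_i}}{(\gamma_{(i,t)})^{n_i} - (\gamma_{(i,s)})^{n_i}}. \tag{6}$$

Lemma 2.3: Let $i \in \{1, 2\}$, $t, s \in T_i$ and $t \in \psi(i, s)$. Let $r = (r_0, r_1, r_2) = \pi(i, t)$. Then

$$E_{(i,t)}(X) = g(X^{n_i}) \text{ for some polynomial } g(\cdot),\ \deg(g) = \mu_i - 1 \tag{7}$$

$$E_{(i,t)}(\theta) = \begin{cases} 1 & \theta \in A_{(i,t)} \\ 0 & \theta \in A_{(i,s)} \end{cases} \tag{8}$$

$$E_{(i,t)}(X)\, E_{(i,s)}(X) = 0 \mod \left( X^{n_{i-1}} - (\gamma_{(i-1,r)})^{n_{i-1}} \right) \tag{9}$$

$$E_{(i,t)}^2(X) = E_{(i,t)}(X) \mod \left( X^{n_{i-1}} - (\gamma_{(i-1,r)})^{n_{i-1}} \right) \tag{10}$$

Proof: The property (7) is clear from the definition of $E_{(i,t)}(X)$. The properties (8), (9) are clear from the proof of
Lemma 2.2. Hence (10) follows by (5). ■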

2) Construction of $c(X)$: We start with associating message polynomials of degree $(r_2 - 1)$ with certain leaves of the
coset-tree. The total number of leaves of the coset-tree equals $\mu_1 \mu_2$. However, we will only consider a suitable subtree of the
coset-tree such that the number of leaves equals $\hat{\mu}_1 \mu_2$ where $\hat{\mu}_1 = n / n_1$. The required subtree is obtained by removing the last
$(\mu_1 - \hat{\mu}_1)$ branches emanating from the root of the tree. Every leaf that is retained in the subtree has an index $(2, t)$ where $t$
belongs to the set

$$T'_2 = \{t \in T_2 \mid 1 \le t_1 \le \hat{\mu}_1\}.$$

This subtree is referred to as the relevant coset-tree. A vertex from the $i$-th level, $i > 0$, of the relevant coset-tree will have an
index $(i, t)$ where $t \in T'_i = \{t \in T_i \mid 1 \le t_1 \le \hat{\mu}_1\}$.

Consider a set $U = \{u_t(X) = u_{t,0} + u_{t,1} X + \cdots + u_{t,r_2-1} X^{r_2 - 1} \mid t \in T'_2\}$ of message polynomials of size $\hat{\mu}_1 \mu_2$. The code
polynomial $c(X)$ is built from $U$ in an iterative manner. In every iteration, we take as input a set of polynomials corresponding
to vertices of the $i$-th level of the relevant coset-tree, and output another set of polynomials corresponding to vertices of
the $(i - 1)$-th level. As noted earlier, each leaf of the relevant coset-tree is uniquely mapped to a polynomial in $U$. In the
end, we will identify a polynomial $c_{(0,(1,0,0))}(X)$ associated with the root of the relevant coset-tree. The code polynomial is
$c(X) = c_{(0,(1,0,0))}(X)$. It may be noted that the polynomials in $U$ are made up of $\hat{\mu}_1 \mu_2 r_2$ message symbols in total. However,
the desired dimension $k$ can be less than $\hat{\mu}_1 \mu_2 r_2$. Hence in every iteration, a precoding of message symbols is carried out,
causing a reduction in the number of independent message symbols. The dimension is reduced to the desired value $k$
at the end of the final iteration.
Let us now start the iteration by setting $c_{(2,t)}(X) = u_t(X)\ \forall t \in T'_2$. Evaluations of $c_{(2,t)}(X)$ at the $n_2$ points in $A_{(2,t)}$ give
rise to an $[n_2, r_2]$-codeword. Recognizing this correspondence, we refer to $c_{(2,t)}(X)$, $t \in T'_2$, as a second level code polynomial.
In the next iteration, for every $t \in T'_1$,

$$d_{(1,t)}(X) = \sum_{s:\, \pi(2,s) = (1,t)} c_{(2,s)}(X)\, E_{(2,s)}(X).$$

By (7) in Lemma 2.3, the coefficient of $X^{\ell}$ is zero in $E_{(2,t)}(X)$ whenever $\ell \ne 0 \pmod{n_2}$. Hence for every $t \in T'_1$, there are
$\mu_2 r_2$ monomials in $d_{(1,t)}(X)$. Evaluations of $d_{(1,t)}(X)$ at the $n_1$ points in $A_{(1,t)}$ give rise to an $[n_1, \mu_2 r_2]$-codeword. Since the
desired dimension of the middle code is $r_1$, we precode the message symbols such that the coefficients of the $(r_2 \mu_2 - r_1)$ highest
degree monomials in $d_{(1,t)}(X)$ vanish. The polynomials $c_{(1,t)}(X)$ thus obtained correspond to an $[n_1, r_1]$-middle
code, and are hence referred to as first level code polynomials. We can write

$$c_{(1,t)}(X) = \sum_{s:\, \pi(2,s) = (1,t)} P_1(c_{(2,s)}(X))\, E_{(2,s)}(X), \quad t \in T'_1,$$

where $P_1(\cdot)$ denotes the precoding transformation at the first level. In the next iteration, we compute $d_{(0,(1,0,0))}(X)$ and
subsequently precode the message symbols by $P_0(\cdot)$ to reduce the dimension from $\hat{\mu}_1 r_1$ to $k$ to obtain the zeroth level code
polynomial $c_{(0,(1,0,0))}(X)$:

$$d_{(0,(1,0,0))}(X) = \sum_{\substack{s:\, \pi(1,s) = (0,(1,0,0)) \\ s \in T'_1}} c_{(1,s)}(X)\, E_{(1,s)}(X),$$

$$c_{(0,(1,0,0))}(X) = \sum_{\substack{s:\, \pi(1,s) = (0,(1,0,0)) \\ s \in T'_1}} P_0(c_{(1,s)}(X))\, E_{(1,s)}(X).$$

The code polynomial $c(X)$ is identified as

$$c(X) = c_{(0,(1,0,0))}(X). \tag{11}$$

3) Evaluation of $c(X)$: The codeword $c = (c(\theta) \mid \theta \in A)$ is obtained by evaluating the polynomial $c(X)$ at $n$ points
taken from

$$A = \bigcup_{t \in T'_1} A_{(1,t)}.$$

This completes the description of the construction. By the construction, it is clear that the dimension and the minimum distance
of the code are given by

$$k = |\{\ell \mid \text{coefficient of } X^{\ell} \text{ in } c(X) \ne 0\}|,$$
$$d \ge n - \deg(c(X)).$$


Remark 1: A principal construction in [9] for codes with all-symbol locality relies on a partitioning of the roots of unity
contained in a finite field into a subgroup and its cosets. The construction then identifies polynomials that are constant on each
coset and makes use of these polynomials in the construction. The approach adopted here is along similar lines.

Example 1: In this example, we construct a code with $[n, k] = [24, 14]$ having locality parameters $(n_1, r_1) = (12, 8)$ and
$(n_2, r_2) = (4, 3)$, satisfying the divisibility condition. We can choose the finite field $\mathbb{F}_{p^m} = \mathbb{F}_{5^2}$. Let $\alpha$ be a primitive element
of $\mathbb{F}_{5^2}$. We have $n_0 = n = 24$, $\mu_1 = \hat{\mu}_1 = 2$, and $\mu_2 = 3$. We set

$$H_0 = \mathbb{F}_{5^2}^*, \qquad H_1 = \{1, \beta_1, \beta_1^2, \ldots, \beta_1^{11}\}, \qquad H_2 = \{1, \beta_2, \beta_2^2, \beta_2^3\},$$

where $\beta_0 = \alpha$, $\beta_1 = \alpha^2$ and $\beta_2 = \alpha^6$. The relevant coset-tree can be computed as

$$A_{(0,(1,0,0))} = H_0, \quad A_{(1,(1,1,0))} = H_1, \quad A_{(1,(1,2,0))} = \beta_0 H_1,$$
$$A_{(2,(1,t_1,t_2))} = \beta_0^{t_1 - 1} \beta_1^{t_2 - 1} H_2, \quad 1 \le t_1 \le 2,\ 1 \le t_2 \le 3.$$

Let us define the index sets $T'_1 = \{(t_0, t_1, t_2) \mid t_0 = 1,\ 1 \le t_1 \le 2,\ t_2 = 0\}$ and $T'_2 = \{(t_0, t_1, t_2) \mid t_0 = 1,\ 1 \le t_1 \le 2,\ 1 \le t_2 \le 3\}$.
For every $t = (1, t_1, 0) \in T'_1$, we set $s_1$ as the unique element in $\{1, 2\} \setminus \{t_1\}$ and then we have

$$E_{(1,t)}(X) = \frac{X^{12} - (\beta_0^{s_1 - 1})^{12}}{(\beta_0^{t_1 - 1})^{12} - (\beta_0^{s_1 - 1})^{12}} := a_t X^{12} + b_t.$$

Similarly for every $t = (1, t_1, t_2) \in T'_2$, we set $\{s_1, s_2\} = \{1, 2, 3\} \setminus \{t_2\}$ and then we have

$$E_{(2,t)}(X) = \left( \frac{X^4 - (\beta_0^{t_1 - 1} \beta_1^{s_1 - 1})^4}{(\beta_0^{t_1 - 1} \beta_1^{t_2 - 1})^4 - (\beta_0^{t_1 - 1} \beta_1^{s_1 - 1})^4} \right) \left( \frac{X^4 - (\beta_0^{t_1 - 1} \beta_1^{s_2 - 1})^4}{(\beta_0^{t_1 - 1} \beta_1^{t_2 - 1})^4 - (\beta_0^{t_1 - 1} \beta_1^{s_2 - 1})^4} \right) := e_t X^8 + f_t X^4 + g_t.$$

There are $|T'_2| = 6$ message polynomials denoted by $\{u_t(X) = u_{t,0} + u_{t,1} X + u_{t,2} X^2 \mid t \in T'_2\}$, each of degree $(r_2 - 1) = 2$.
The second level code polynomial for each $t \in T'_2$ corresponding to a [4,3]-local code is taken to be $c_{(2,t)}(X) = u_t(X)$. In
the next step, the first level code polynomial $\{c_{(1,t)}(X)\}$ is constructed as

$$c_{(1,t)}(X) = \sum_{s:\, s_1 = t_1,\, s_2 = 1}^{s_2 = 3} P_1(u_s(X)) \left( e_s X^8 + f_s X^4 + g_s \right)$$

for each $t \in T'_1$. By virtue of the precoding $P_1(\cdot)$, the term $X^{10}$ vanishes and the resultant polynomial $c_{(1,t)}(X)$ corresponds
to a [12,8]-middle code. Subsequently, the zeroth level code polynomial is constructed as

$$c_{(0,(1,0,0))}(X) = \sum_{s:\, s_1 = 1,\, s_2 = 0}^{s_1 = 2} P_0(c_{(1,s)}(X)) \left( a_s X^{12} + b_s \right).$$

Without the precoding $P_0(\cdot)$, we would have obtained a polynomial of degree 21 having 16 monomials. Precoding wipes out the
terms $\{X^{21}, X^{20}\}$, and the resultant polynomial $c_{(0,(1,0,0))}(X) =: c(X)$ of degree 18 is the code polynomial consisting of 14
monomials. Thus $k = 14$, and $d \ge 6$. The codeword $c$ is given by $c = (c(\theta) \mid \theta \in H_0)$.
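
The coset structure used in Example 1 is small enough to be checked by brute force. The sketch below is only an illustration under stated assumptions: it models $\mathbb{F}_{5^2}$ as $\mathbb{F}_5[w]/(w^2 - 2)$ (our choice of irreducible polynomial; the example does not fix one), picks the first element of order 24 found as $\alpha$, and all function names are ours. It builds the six leaf cosets $A_{(2,(1,t_1,t_2))}$ and evaluates $E_{(2,t)}$ in the product form of Lemma 2.2, so that the evaluation property (8) can be observed directly.

```python
from itertools import product

P = 5                       # GF(25) modelled as GF(5)[w] / (w^2 - 2); 2 is a non-square mod 5
ONE, ZERO = (1, 0), (0, 0)  # element (a, b) stands for a + b*w

def mul(x, y):
    (a, b), (c, d) = x, y
    return ((a * c + 2 * b * d) % P, (a * d + b * c) % P)

def sub(x, y):
    return ((x[0] - y[0]) % P, (x[1] - y[1]) % P)

def power(x, e):
    result = ONE
    for _ in range(e):
        result = mul(result, x)
    return result

def order(x):
    """Multiplicative order of a nonzero element."""
    e, y = 1, x
    while y != ONE:
        y, e = mul(y, x), e + 1
    return e

def inv(x):
    return power(x, 23)     # x^24 = 1 for every nonzero x

nonzero = [x for x in product(range(P), repeat=2) if x != ZERO]
alpha = next(x for x in nonzero if order(x) == 24)       # a primitive element
beta0, beta1, beta2 = alpha, power(alpha, 2), power(alpha, 6)
H2 = {power(beta2, i) for i in range(4)}

def gamma(t1, t2):
    """Coset representative beta0^(t1-1) * beta1^(t2-1) of a leaf."""
    return mul(power(beta0, t1 - 1), power(beta1, t2 - 1))

def coset(t1, t2):
    return {mul(gamma(t1, t2), h) for h in H2}

def E2(t1, t2, theta):
    """Evaluate E_(2,(1,t1,t2)) at theta via the product form with n_2 = 4."""
    value, gt4 = ONE, power(gamma(t1, t2), 4)
    for s2 in (1, 2, 3):
        if s2 == t2:
            continue
        gs4 = power(gamma(t1, s2), 4)
        value = mul(value, mul(sub(power(theta, 4), gs4), inv(sub(gt4, gs4))))
    return value

leaves = {(t1, t2): coset(t1, t2) for t1 in (1, 2) for t2 in (1, 2, 3)}
print(len(set().union(*leaves.values())))   # number of distinct evaluation points
```

Evaluating `E2(t1, t2, theta)` over the leaves shows the indicator-like behaviour: the value is the field identity on the polynomial's own coset and zero on the two sibling cosets.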

It is of interest to look at the exponents of monomials in polynomials of each level. From each level, we pick a candidate
polynomial: $c(X)$, $c_1(X) := c_{(1,(1,1,0))}(X)$, $c_2(X) := c_{(2,(1,1,2))}(X)$.

Fig. 4: Illustration of the exponents of monomials in $c_2(X)$, $c_1(X)$ and $c(X)$ in order. The canceled exponents are those whose
coefficients are fixed to zero by precoding.

The illustration in Fig. 4 gives an equivalent simplistic description of the [24,14,6] code. This works in general. Let $\mathrm{Exp}(f)$
represent the ordered set of exponents of the monomials in a polynomial $f(X)$. By ordered set, we mean that the elements
of the set are listed in descending order. For example, $\mathrm{Exp}(X^3 + aX + 1) = \{3, 1, 0\}$. For an ordered finite set $S$ of
non-negative integers and a positive integer $r \le |S|$, we define $\mathrm{Trunc}(S, r)$ as the set comprising the last $r$ elements of the
set. Then we have that

$$\mathrm{Exp}(c_2) = \{r_2 - 1, r_2 - 2, \ldots, 0\} = \mathrm{Trunc}(\mathbb{Z}_{n_2}, r_2),$$
$$\mathrm{Exp}(c_1) = \mathrm{Trunc}\Bigl( \bigcup_{j=0}^{\mu_2 - 1} (j n_2 + \mathrm{Exp}(c_2)),\ r_1 \Bigr),$$
$$\mathrm{Exp}(c) = \mathrm{Trunc}\Bigl( \bigcup_{j=0}^{\hat{\mu}_1 - 1} (j n_1 + \mathrm{Exp}(c_1)),\ k \Bigr),$$

where $\mathbb{Z}_n = \{0, 1, \ldots, n - 1\}$. The set $\mathrm{Exp}(c)$ is an equivalent simplistic description of the code. In terms of $\mathrm{Exp}(c)$, we can
write the parameters of the code as $k = |\mathrm{Exp}(c)|$, $d \ge n - \max(\mathrm{Exp}(c))$.
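
The exponent-set description lends itself to a direct computation. The following sketch, with function names of our own choosing, implements $\mathrm{Trunc}$ and the three set recursions for a two-level code and applies them to the parameters of Example 1. Sets are stored as ascending lists, so "the last $r$ elements in descending order" become the first $r$ entries.

```python
def trunc(exponents, r):
    """Trunc(S, r): keep the r smallest exponents, returned in ascending order."""
    return sorted(exponents)[:r]

def exponent_sets(n, k, n1, r1, n2, r2):
    """Exp(c_2), Exp(c_1) and Exp(c) for the two-level construction."""
    exp_c2 = trunc(range(n2), r2)
    # Shift the local exponent set by multiples of n2, then keep r1 exponents.
    exp_c1 = trunc({j * n2 + e for j in range(n1 // n2) for e in exp_c2}, r1)
    # Shift the middle exponent set by multiples of n1, then keep k exponents.
    exp_c = trunc({j * n1 + e for j in range(n // n1) for e in exp_c1}, k)
    return exp_c2, exp_c1, exp_c

exp_c2, exp_c1, exp_c = exponent_sets(n=24, k=14, n1=12, r1=8, n2=4, r2=3)
print(exp_c1)                 # [0, 1, 2, 4, 5, 6, 8, 9]
print(exp_c)                  # [0, 1, 2, 4, 5, 6, 8, 9, 12, 13, 14, 16, 17, 18]
print(len(exp_c), 24 - max(exp_c))   # 14 6
```

The output reproduces the counts stated in Example 1: the exponent 10 is dropped at the first level, the exponents 20 and 21 at the zeroth level, and the resulting degree of 18 gives $d \ge 6$.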


E. Locality Properties Of the Code 

In this section, we will show that the code satisfies the locality constraints. Consider the case where $c(y)$ is lost. We need to recover
it by accessing $r_1$ other symbols $\{c(y_1), c(y_2), \ldots, c(y_{r_1})\}$ that along with $c(y)$ are part of an $[n_1, r_1]$ punctured code. Without
loss of generality, let us assume that $y \in A_{(1,(1,1,0))}$. Using (9), (10) in Lemma 2.3 we can write

$$c(X)\, E_{(1,(1,1,0))}(X) = P_0(c_{(1,(1,1,0))}(X))\, E_{(1,(1,1,0))}(X).$$

Evaluations at $r_1$ out of the $n_1$ points in $A_{(1,(1,1,0))}$ will help reconstruct $P_0(c_{(1,(1,1,0))}(X))$, since $\deg(P_0(c_{(1,(1,1,0))}(X))) \le (r_1 - 1)$.
Then we can recover $c(y) = P_0(c_{(1,(1,1,0))}(y))\, E_{(1,(1,1,0))}(y)$. The same argument can be used inductively to show
that each symbol within an $[n_1, r_1]$-middle code can be recovered by $r_2$ out of some $n_2$ symbols. This establishes the existence
of $[n_2, r_2]$-local codes.
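
The innermost step of this argument, recovering one erased symbol of an $[n_2, r_2] = [4, 3]$ local code from the other three, can be illustrated with plain Lagrange interpolation. The sketch below is a simplification and not the construction itself: to keep the arithmetic short it works over the prime field $\mathbb{F}_{13}$ instead of $\mathbb{F}_{5^2}$, and the coset, the message coefficients and the function names are all chosen by us for illustration.

```python
P = 13                      # prime field GF(13), used only to keep the arithmetic simple
COSET = [2, 10, 11, 3]      # the coset 2*H of the order-4 subgroup H = {1, 5, 12, 8}

def evaluate(coeffs, x):
    """Horner evaluation of coeffs[0] + coeffs[1]*x + ... modulo P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def repair(known, x_lost):
    """Interpolate the polynomial of degree < len(known) through `known`
    (a dict point -> value) and evaluate it at the erased point x_lost."""
    total = 0
    for xi, yi in known.items():
        num, den = 1, 1
        for xj in known:
            if xj != xi:
                num = num * (x_lost - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

message = [7, 1, 4]                                   # a degree-2 message polynomial
codeword = {x: evaluate(message, x) for x in COSET}   # the [4, 3] local codeword

lost = 11
helpers = {x: y for x, y in codeword.items() if x != lost}
print(repair(helpers, lost) == codeword[lost])        # True
```

Only $r_2 = 3$ helper symbols are read, which is the repair degree claimed for single erasures; repair within a middle code follows the same pattern with $r_1$ helpers once the precoded polynomial is isolated as described above.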
F. Optimality Of the Code

Theorem 2.4: The $[n, k, d]$-code with $[n_1, r_1, \delta_1]$-middle codes and $[n_2, r_2, \delta_2]$-local codes constructed in Sec. II-D achieves
the optimal minimum distance if $r_2 \mid r_1 \mid k$.

Proof: Let $c(X)$ denote the code polynomial. The proof follows from counting $\mathbb{Z}_n$ in two different ways. We can write

$$n = |\mathbb{Z}_n| = |\mathrm{Exp}(c)| + |\mathbb{Z}_n \setminus \mathrm{Exp}(c)|. \tag{12}$$

We have that $|\mathrm{Exp}(c)| = k$. On the other hand, since $r_2 \mid r_1 \mid k$, we can count the number of exponents that are truncated
while forming $c$ as

$$|\mathbb{Z}_n \setminus \mathrm{Exp}(c)| = \left( \frac{k}{r_2} - 1 \right)(\delta_2 - 1) + \left( \frac{k}{r_1} - 1 \right)(\delta_1 - \delta_2) + (d - 1).$$

Substituting back in (12), we conclude that the code is optimal. ■

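Theorem 2.5: The $[n, k, d]$-code with $[n_1, r_1, \delta_1]$-middle codes and $[n_2, r_2, \delta_2]$-local codes constructed in Sec. II-D achieves
the optimal minimum distance if the following conditions hold:

1) $d = n_2 + \delta_2$,

2) $\dfrac{n}{n_1} = \left\lceil \dfrac{k}{r_1} \right\rceil, \quad \dfrac{n}{n_2} = \left\lceil \dfrac{k}{r_2} \right\rceil + 1$.

Proof: Let $c(X)$ denote the code polynomial. The proof is analogous to that of Thm. 2.4. The only difference lies in the
count of $|\mathbb{Z}_n \setminus \mathrm{Exp}(c)|$. Since $d = n_2 + \delta_2$, we obtain that

$$\begin{aligned}
|\mathbb{Z}_n \setminus \mathrm{Exp}(c)| &= \left( \frac{n}{n_2} - 1 \right)(\delta_2 - 1) + \left( \frac{n}{n_1} - 1 \right)(\delta_1 - \delta_2) + \bigl( (d - 1) - (\delta_2 - 1) \bigr) \\
&= \left( \frac{n}{n_2} - 2 \right)(\delta_2 - 1) + \left( \frac{n}{n_1} - 1 \right)(\delta_1 - \delta_2) + (d - 1).
\end{aligned}$$

Hence the code is optimal if the second condition in the theorem holds. ■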

While Thm. 2.4 and Thm. 2.5 provide optimality conditions that can be generalized to hierarchical locality of arbitrary levels, a
subject of discussion in Sec. III, the next theorem characterizes the conditions for optimality for two-level hierarchy without
imposing any restrictions.

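Theorem 2.6: The $[n, k, d]$-code with $[n_1, r_1, \delta_1]$-middle codes and $[n_2, r_2, \delta_2]$-local codes constructed in Sec. II-D achieves
the optimal minimum distance if

$$\left\lceil \frac{k}{r_2} \right\rceil = \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right) \left\lceil \frac{r_1}{r_2} \right\rceil + \left\lceil \frac{k - \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right) r_1}{r_2} \right\rceil. \tag{13}$$

Proof: Let $c(X)$ denote the code polynomial. The proof is analogous to that of Thm. 2.4. It is possible to count the size
of $\mathbb{Z}_n \setminus \mathrm{Exp}(c)$ as

$$\left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right) \left( \left\lceil \frac{r_1}{r_2} \right\rceil (\delta_2 - 1) + (\delta_1 - \delta_2) \right) + \left( \left\lceil \frac{k - \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right) r_1}{r_2} \right\rceil - 1 \right)(\delta_2 - 1) + (d - 1).$$

The expression can be recast into the form

$$\left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right)(\delta_1 - \delta_2) + (d - 1) + \left( \left\lceil \frac{r_1}{r_2} \right\rceil \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right) + \left\lceil \frac{k - \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right) r_1}{r_2} \right\rceil - 1 \right)(\delta_2 - 1),$$

thus leading to a value of $d$ as in

$$d = n - k + 1 - \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right)(\delta_1 - \delta_2) - \left( \left\lceil \frac{r_1}{r_2} \right\rceil \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right) + \left\lceil \frac{k - \left( \left\lceil \frac{k}{r_1} \right\rceil - 1 \right) r_1}{r_2} \right\rceil - 1 \right)(\delta_2 - 1).$$

Comparing against the upper bound in (1), we conclude that the code is optimal if (13) holds. ■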
III. Generalization To h-Level Hierarchy

In Sec. II, we considered codes with hierarchical locality where the hierarchy had two levels. Here we extend the notion to
$h$-level hierarchy, where $h$ is an arbitrary number.

Definition 3: An $[n, k, d]$ linear code $\mathcal{C}$ is a code with $h$-level hierarchical locality having locality parameters
$[(r_1, \delta_1), (r_2, \delta_2), \ldots, (r_h, \delta_h)]$ if for every symbol $c_i$, $1 \le i \le n$, there exists a punctured code $\mathcal{C}_i$ such that $c_i \in \mathrm{Supp}(\mathcal{C}_i)$ and
the following conditions hold:

1) $\dim(\mathcal{C}_i) \le r_1$,

2) $d_{\min}(\mathcal{C}_i) \ge \delta_1$,

3) $\mathcal{C}_i$ is a code with $(h - 1)$-level hierarchical locality having locality parameters $[(r_2, \delta_2), (r_3, \delta_3), \ldots, (r_h, \delta_h)]$.

A code with 1-level hierarchical locality is defined to be a code with locality.

The punctured code $\mathcal{C}_i$ associated with $c_i$ is referred to as a local code of level-1. In fact, each symbol is associated with
a collection of local codes, one of each level-$\ell$, $\ell = 1, 2, \ldots, h$. In the previous section, we studied codes with 2-level hierarchical
locality.


A. An Upper Bound on The Minimum Distance 


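Theorem 3.1: Let $\mathcal{C}$ be an $[n, k, d]$-linear code with $h$-level hierarchical locality having locality parameters
$[(r_1, \delta_1), (r_2, \delta_2), \ldots, (r_h, \delta_h)]$. Then

$$d \le n - k + 1 - \sum_{\ell=1}^{h-1} \left( \left\lceil \frac{k}{r_\ell} \right\rceil - 1 \right)(\delta_\ell - \delta_{\ell+1}) - \left( \left\lceil \frac{k}{r_h} \right\rceil - 1 \right)(\delta_h - 1). \tag{14}$$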


Proof: The proof is a straightforward extension of that of Thm. |2.l| The algorithmic (see flow chart in Fig. |5(b)| i identifies 
a (k— 1)-dimensional punctured code C s of C, having large support. Then we will use the Fact in The algorithm identifies 
a level-1 code that accumulates rank, and subsequently visits a level-2 code from within that, and continues recursively upto 
reaching a level-(/i — 1) code that accumulates rank. Then it picks up all the level-/i codes that accumulate rank. If no more 
level-/i codes can be picked up, it steps back one level up, and finds a new level-(/i — 1) code that accumulates rank. This 
can be viewed as a depth-first search for rank-accumulating level-/( codes. At each level, incremental support is added to the 
variable T. Vaguely speaking, the incremental support that is added at each level depends on the minimum distance of the 
code at that level. Let a ih denote the incremental rank and s lh denote the incremental support while adding a level-/i code 


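Algorithm 2 For the proof of Thm. 3.1 

Let $i_1 = 0, i_2 = 0, \ldots, i_h = 0$; $\mathcal{T} = \phi$; $\ell = 1$; 
$M_0 = \mathcal{C}$, $M_1 = \phi$, $M_2 = \phi$, $\ldots$, $M_h = \phi$ 
while ($\text{rank}(G|_{\mathcal{T}}) < k$) do 
    if ($\exists$ a level-$\ell$ code $L_{i_\ell} \in M_{\ell-1}$ such that $\text{rank}(G|_{\mathcal{T} \cup \text{Supp}(L_{i_\ell})}) > \text{rank}(G|_{\mathcal{T}})$) then 
        $M_\ell = L_{i_\ell}$ 
        if ($\ell$ equals $h$) then 
            $\mathcal{T} = \mathcal{T} \cup \text{Supp}(M_h)$ 
            $S_{\text{last}} = \text{Supp}(M_h)$ 
            $i_h = i_h + 1$ 
        else 
            $\ell = \ell + 1$ 
        end if 
    else 
        $\ell = \ell - 1$; 
        $\mathcal{T} = (\mathcal{T} \setminus S_{\text{last}}) \cup T_\ell$ 
        $S_{\text{last}} = T_\ell$ 
        $i_\ell = i_\ell + 1$ 
    end if 
end while 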


$L_{i_h}$. By the algorithm, $a_{i_h} > 0$. The set $S_{i_h}$ denotes the support of the level-$h$ code $L_{i_h}$. The set $T_\ell$ denotes the incremental 
support of the level-$\ell$ code $M_\ell$, along with the columns from the last code that accumulated rank. It will contain at least 
$(\delta_\ell - 1)$ columns in addition to the incremental rank. Let $i_{\ell,\text{end}}$ denote the final value of the variable $i_\ell$, $1 \le \ell \le h$, before the 
algorithm terminates. Since $a_{i_h} > 0$, we can get non-trivial lower bounds on $s_{i_h}$ and $t_\ell := |T_\ell|$, quite similar to the proof of 
Thm. 2.1. The rank is accumulated to $k$ after adding the last local code $L_{i_{h,\text{end}}}$. By this time, we would have also visited $i_{\ell,\text{end}}$ 
level-$\ell$ codes. Hence clearly, 

$$i_{\ell,\text{end}} \ge \left\lceil \frac{k}{r_\ell} \right\rceil, \quad 1 \le \ell \le h.$$


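After adding $i_{h,\text{end}} - 1$ level-$h$ codes, we would have accumulated rank that is less than or equal to $(k-1)$. Hence we can 
always pick $(k-1) - \sum a_{i_h}$ columns from the last code $L_{i_{h,\text{end}}}$, the sum being taken over the level-$h$ codes already added, so that the total rank accumulated becomes $(k-1)$. The resultant 
punctured code is identified as $\mathcal{C}_s$. Following a similar line of arguments as in the proof of Thm. 2.1, we can get an estimate 
on $|\text{Supp}(\mathcal{C}_s)|$ as 

$$|\text{Supp}(\mathcal{C}_s)| \ge (k-1) + \sum_{\ell=1}^{h-1} \left( \left\lceil \frac{k}{r_\ell} \right\rceil - 1 \right) (\delta_\ell - \delta_{\ell+1}) + \left( \left\lceil \frac{k}{r_h} \right\rceil - 1 \right) (\delta_h - 1).$$

Hence the theorem follows. 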


B. Code Construction For All-Symbol Locality 


The construction in Sec. II-D of the main text can be generalized to construct codes with $h$-level hierarchical locality 
containing $[n_i, r_i, \delta_i]$ codes as the $i$-th level code for each $i = 1, 2, \ldots, h$. Here also, we require the divisibility condition 
$n_h \mid n_{h-1} \mid \cdots \mid n$ to be satisfied. The generalization is straightforward, and it boils down to finding a finite field $\mathbb{F}_{p^m}$ such that $n_h \mid n_{h-1} \mid 
\cdots \mid n_1 \mid (p^m - 1)$ and $n \le (p^m - 1)$. Then we can find a subgroup chain $H_h \subset H_{h-1} \subset \cdots \subset H_0 = \mathbb{F}_{p^m}^*$. This allows us 
to create a coset-tree of depth $h$, and the code construction follows naturally. It can also be proved that the construction thus 
obtained will be optimal in terms of minimum distance if either of the following two conditions holds: 

1) $r_h \mid r_{h-1} \mid \cdots \mid r_1 \mid k$. 


2) d = n h + 5 h ,^ h = 


+ 1,^ = 


Vi = 1,2,.... /i — 1. 
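The subgroup chain and coset tree used by the construction can be illustrated with a small sketch. It assumes the prime-field case $m = 1$, so that field arithmetic is plain arithmetic modulo $p$; the function names (`primitive_root`, `subgroup_chain`, `split_into_cosets`) are hypothetical and the code is not the authors' implementation. It only builds the evaluation-point structure, not the code itself.

```python
def prime_factors(x):
    """Set of prime factors of x, by trial division."""
    factors, d = set(), 2
    while d * d <= x:
        while x % d == 0:
            factors.add(d)
            x //= d
        d += 1
    if x > 1:
        factors.add(x)
    return factors


def primitive_root(p):
    """Smallest generator of the multiplicative group of F_p (p prime)."""
    order = p - 1
    for g in range(2, p):
        if all(pow(g, order // q, p) != 1 for q in prime_factors(order)):
            return g
    return 1  # only reached for p = 2, where the group is trivial


def subgroup_chain(p, orders):
    """Return [H_1, ..., H_h] with |H_i| = orders[i-1]; H_0 is all of F_p^*.

    Requires orders[h-1] | ... | orders[0] | p - 1.
    """
    chain_orders = [p - 1] + list(orders)
    for big, small in zip(chain_orders, chain_orders[1:]):
        if big % small != 0:
            raise ValueError("divisibility chain is violated")
    g = primitive_root(p)
    chain = []
    for n_i in orders:
        gen = pow(g, (p - 1) // n_i, p)  # element of order n_i
        chain.append(sorted({pow(gen, e, p) for e in range(n_i)}))
    return chain


def split_into_cosets(parent, subgroup, p):
    """Partition a coset `parent` of H_{i-1} into cosets of H_i."""
    remaining, cosets = set(parent), []
    while remaining:
        rep = min(remaining)
        coset = sorted(rep * x % p for x in subgroup)
        cosets.append(coset)
        remaining -= set(coset)
    return cosets


# Illustrative parameters: p = 13, n_1 = 6, n_2 = 3.
H1, H2 = subgroup_chain(13, [6, 3])
print(split_into_cosets(range(1, 13), H1, 13))  # level-1 nodes of the tree
print(split_into_cosets(H1, H2, 13))            # children of the node H1
```

Applying `split_into_cosets` recursively to every node yields a coset tree of depth $h$. For $m > 1$ the same idea applies, with arithmetic in $\mathbb{F}_{p^m}$ replacing arithmetic modulo $p$.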
Appendix A 

Flowcharts For Algorithms Used to Derive Bounds 


The flow charts for Algorithms 1 and 2 are shown in Fig. 5(a) and Fig. 5(b), respectively. 

(a) Algorithm 1 (b) Algorithm 2 

Fig. 5: Flowcharts for the algorithms used for proving the bounds on minimum distance. 


Appendix B 

Existence Of Required Field 

First we will show that there exists a prime $p$ such that $n_1 \mid p - 1$. By Dirichlet's theorem, if $a$ and $d$ are two co-prime 
numbers, then the sequence $a, a + d, a + 2d, \ldots$ contains infinitely many primes. By setting $a = n_1 + 1$ and $d = n_1$, which are co-prime, we 
observe that there are infinitely many primes of the form $(n_1 + 1) + \ell n_1$, i.e., of the form $(\ell + 1) n_1 + 1$. Thus we obtain a 
prime $p$ such that $n_1 \mid p - 1$. If $n \le p - 1$, we are done. If not, pick a sufficiently large $m$ such that $n < p^m$. Since $n_1 \mid p - 1$ 
and $(p - 1) \mid (p^m - 1)$, we must also have $n_1 \mid p^m - 1$. 
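The argument above is constructive and can be turned into a short search. The sketch below is only an illustration under stated assumptions: the name `find_field` is hypothetical, primality is tested by trial division (adequate for small parameters only), and the first prime in the progression is always taken, whereas the argument allows any prime in it.

```python
def is_prime(x):
    """Trial-division primality test; fine for small inputs."""
    if x < 2:
        return False
    d = 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += 1
    return True


def find_field(n, n1):
    """Return (p, m) with p prime, n1 | p - 1 and n <= p**m - 1."""
    # Walk the progression n1 + 1, 2*n1 + 1, 3*n1 + 1, ...
    # Dirichlet's theorem guarantees that it contains a prime.
    p = n1 + 1
    while not is_prime(p):
        p += n1
    # Raise the extension degree until the field has enough nonzero elements.
    m = 1
    while p ** m - 1 < n:
        m += 1
    return p, m


# Illustrative parameters (chosen here, not from the paper).
p, m = find_field(24, 8)
print(p, m, (p ** m - 1) % 8 == 0)
```

One could instead keep searching the progression for a prime with $n \le p - 1$, which keeps $m = 1$ at the cost of a larger characteristic.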


Appendix C 
Proof Of Lemma 2.2 

It is sufficient to verify that 

$$q_{(i,t)}(X) \mid E_{(i,t)}(X), \tag{15}$$

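$$\sum_{\substack{t_i = 1 \\ t_j \text{ fixed for } j \neq i}}^{\nu_i} E_{(i,t)}(X) = 1 \mod \left( X^{n_{i-1}} - \gamma_{(i-1,\tau)}^{n_{i-1}} \right), \tag{16}$$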

where $\tau$ is the unique element such that $(i-1, \tau) = \pi(i, t)$ for every $t$ participating in the summation. For every such $t$, the 
roots of $q_{(i,t)}(X)$ are precisely 

$$A_{(i-1,\tau)} \setminus A_{(i,t)}. \tag{17}$$


It can also be checked that $E_{(i,t)}(X)$ evaluates to zero at any point in $A_{(i-1,\tau)} \setminus A_{(i,t)}$. Hence $q_{(i,t)}(X) \mid E_{(i,t)}(X)$. It can also be 
seen that at any point $\gamma \in A_{(i-1,\tau)}$, all except one term in the L.H.S. of (16) evaluate to zero, and the remaining term 
evaluates to 1. Hence (16) holds, thereby completing the proof. 
























